Dataset statistics
| Number of variables | 16 |
|---|---|
| Number of observations | 29531 |
| Missing cells | 88488 |
| Missing cells (%) | 18.7% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 3.6 MiB |
| Average record size in memory | 128.0 B |
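The derived figures in this table follow from the raw counts. The sketch below re-derives them in plain Python. The variable names are illustrative, and the 8-bytes-per-column figure is an assumption inferred from the 128.0 B record size; the report does not state it.

```python
# Re-derive the summary figures from the raw counts reported above.
n_variables = 16
n_observations = 29531
missing_cells = 88488

total_cells = n_variables * n_observations        # every cell in the frame
missing_pct = 100 * missing_cells / total_cells   # share of empty cells

# Assumption: each column stores one 8-byte value per row
# (a float64 or an object pointer), which gives 16 * 8 = 128 B per record.
bytes_per_record = n_variables * 8
total_mib = bytes_per_record * n_observations / 2**20

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Record size: {bytes_per_record} B, total: {total_mib:.1f} MiB")
```

Both results round to the values in the table (18.7% and 3.6 MiB).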
Variable types
| Categorical | 3 |
|---|---|
| Numeric | 13 |
| Date has a high cardinality: 2009 distinct values | High cardinality |
|---|---|
| PM2.5 is highly correlated with PM10 and 1 other field | High correlation |
| PM10 is highly correlated with PM2.5 and 4 other fields | High correlation |
| NO is highly correlated with PM10 and 1 other field | High correlation |
| NO2 is highly correlated with PM10 and 1 other field | High correlation |
| NOx is highly correlated with PM10 and 2 other fields | High correlation |
| CO is highly correlated with AQI | High correlation |
| Benzene is highly correlated with Toluene and 1 other field | High correlation |
| Toluene is highly correlated with Benzene and 1 other field | High correlation |
| Xylene is highly correlated with Benzene and 1 other field | High correlation |
| AQI is highly correlated with PM2.5 and 2 other fields | High correlation |
| PM2.5 is highly correlated with PM10 and 1 other field | High correlation |
| PM10 is highly correlated with PM2.5 and 3 other fields | High correlation |
| NO is highly correlated with PM10 and 1 other field | High correlation |
| NO2 is highly correlated with NOx and 1 other field | High correlation |
| NOx is highly correlated with PM10 and 2 other fields | High correlation |
| CO is highly correlated with AQI | High correlation |
| Benzene is highly correlated with Toluene | High correlation |
| Toluene is highly correlated with Benzene | High correlation |
| AQI is highly correlated with PM2.5 and 3 other fields | High correlation |
| PM2.5 is highly correlated with PM10 and 1 other field | High correlation |
| PM10 is highly correlated with PM2.5 and 1 other field | High correlation |
| NO is highly correlated with NOx | High correlation |
| NOx is highly correlated with NO | High correlation |
| Benzene is highly correlated with Toluene and 1 other field | High correlation |
| Toluene is highly correlated with Benzene | High correlation |
| Xylene is highly correlated with Benzene | High correlation |
| AQI is highly correlated with PM2.5 and 1 other field | High correlation |
| City is highly correlated with PM10 and 5 other fields | High correlation |
| PM2.5 is highly correlated with PM10 and 2 other fields | High correlation |
| PM10 is highly correlated with City and 4 other fields | High correlation |
| NO is highly correlated with PM10 and 1 other field | High correlation |
| NO2 is highly correlated with AQI | High correlation |
| NOx is highly correlated with NO | High correlation |
| NH3 is highly correlated with City | High correlation |
| CO is highly correlated with City and 2 other fields | High correlation |
| SO2 is highly correlated with City and 2 other fields | High correlation |
| Benzene is highly correlated with Toluene | High correlation |
| Toluene is highly correlated with Benzene | High correlation |
| AQI is highly correlated with City and 6 other fields | High correlation |
| AQI_Bucket is highly correlated with City and 3 other fields | High correlation |
| PM2.5 has 4598 (15.6%) missing values | Missing |
| PM10 has 11140 (37.7%) missing values | Missing |
| NO has 3582 (12.1%) missing values | Missing |
| NO2 has 3585 (12.1%) missing values | Missing |
| NOx has 4185 (14.2%) missing values | Missing |
| NH3 has 10328 (35.0%) missing values | Missing |
| CO has 2059 (7.0%) missing values | Missing |
| SO2 has 3854 (13.1%) missing values | Missing |
| O3 has 4022 (13.6%) missing values | Missing |
| Benzene has 5623 (19.0%) missing values | Missing |
| Toluene has 8041 (27.2%) missing values | Missing |
| Xylene has 18109 (61.3%) missing values | Missing |
| AQI has 4681 (15.9%) missing values | Missing |
| AQI_Bucket has 4681 (15.9%) missing values | Missing |
| Benzene is highly skewed (γ1 = 21.30421849) | Skewed |
| NOx has 740 (2.5%) zeros | Zeros |
| CO has 2328 (7.9%) zeros | Zeros |
| Benzene has 3802 (12.9%) zeros | Zeros |
| Toluene has 2861 (9.7%) zeros | Zeros |
| Xylene has 1747 (5.9%) zeros | Zeros |
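The Skewed alert for Benzene reports a skewness coefficient γ1. The report does not say which estimator it uses, so the sketch below assumes the plain moment-based definition, γ1 = m3 / m2^(3/2), and applies it to made-up data. The function name and both samples are hypothetical.

```python
def skewness(values):
    """Moment-based skewness: third central moment over the
    second central moment raised to the power 3/2."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    return m3 / m2 ** 1.5

# Hypothetical samples, not taken from the dataset.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_tailed = [0.0, 0.0, 0.1, 0.2, 0.3, 50.0]   # many near-zero values, one spike

print(skewness(symmetric))      # symmetric data has no skew
print(skewness(right_tailed))   # a long right tail gives a positive value
```

A column with many zeros and a few very large readings, as the Zeros and Skewed alerts describe for Benzene, behaves like the second sample.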
Reproduction
| Analysis started | 2021-12-26 14:32:28.502799 |
|---|---|
| Analysis finished | 2021-12-26 14:32:49.630384 |
| Duration | 21.13 seconds |
| Software version | pandas-profiling v3.1.1 |
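The Duration row is the difference between the two timestamps in this table. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
started = datetime.fromisoformat("2021-12-26 14:32:28.502799")
finished = datetime.fromisoformat("2021-12-26 14:32:49.630384")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```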
City
| Distinct | 26 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 230.8 KiB |
Length
| Max length | 18 |
|---|---|
| Median length | 8 |
| Mean length | 8.275744133 |
| Min length | 5 |
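The mean length is the total number of characters divided by the number of rows. The sketch below checks this against the Total characters figure reported in the Characters and Unicode table for this variable; the variable names are illustrative.

```python
n_rows = 29531              # observations, from the dataset statistics
total_characters = 244391   # Total characters for this variable

mean_length = total_characters / n_rows
print(f"Mean length: {mean_length:.9f}")
```

The same relationship holds for the Date variable, where 295310 characters over 29531 rows give a mean length of exactly 10.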
Characters and Unicode
| Total characters | 244391 |
|---|---|
| Distinct characters | 38 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Ahmedabad |
|---|---|
| 2nd row | Ahmedabad |
| 3rd row | Ahmedabad |
| 4th row | Ahmedabad |
| 5th row | Ahmedabad |
Common Values
| Value | Count | Frequency (%) |
| Mumbai | 2009 | 6.8% |
| Delhi | 2009 | 6.8% |
| Chennai | 2009 | 6.8% |
| Bengaluru | 2009 | 6.8% |
| Lucknow | 2009 | 6.8% |
| Ahmedabad | 2009 | 6.8% |
| Hyderabad | 2006 | 6.8% |
| Patna | 1858 | 6.3% |
| Gurugram | 1679 | 5.7% |
| Visakhapatnam | 1462 | 5.0% |
| Other values (16) | 10472 |
[Figure: Length histogram]
Words
| Value | Count | Frequency (%) |
| mumbai | 2009 | 6.8% |
| delhi | 2009 | 6.8% |
| chennai | 2009 | 6.8% |
| bengaluru | 2009 | 6.8% |
| lucknow | 2009 | 6.8% |
| ahmedabad | 2009 | 6.8% |
| hyderabad | 2006 | 6.8% |
| patna | 1858 | 6.3% |
| gurugram | 1679 | 5.7% |
| visakhapatnam | 1462 | 5.0% |
| Other values (16) | 10472 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 46303 | |
| r | 21033 | 8.6% |
| u | 15396 | 6.3% |
| n | 15294 | 6.3% |
| h | 13678 | 5.6% |
| i | 13664 | 5.6% |
| e | 11353 | 4.6% |
| m | 10991 | 4.5% |
| d | 8334 | 3.4% |
| t | 8306 | 3.4% |
| Other values (28) | 80039 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 214860 | |
| Uppercase Letter | 29531 | 12.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 46303 | |
| r | 21033 | |
| u | 15396 | 7.2% |
| n | 15294 | 7.1% |
| h | 13678 | 6.4% |
| i | 13664 | 6.4% |
| e | 11353 | 5.3% |
| m | 10991 | 5.1% |
| d | 8334 | 3.9% |
| t | 8306 | 3.9% |
| Other values (13) | 50508 |
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 4294 | |
| B | 3236 | |
| C | 2699 | |
| J | 2283 | |
| G | 2181 | |
| T | 2037 | |
| M | 2009 | |
| L | 2009 | |
| D | 2009 | |
| H | 2006 | |
| Other values (5) | 4768 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 244391 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 46303 | |
| r | 21033 | 8.6% |
| u | 15396 | 6.3% |
| n | 15294 | 6.3% |
| h | 13678 | 5.6% |
| i | 13664 | 5.6% |
| e | 11353 | 4.6% |
| m | 10991 | 4.5% |
| d | 8334 | 3.4% |
| t | 8306 | 3.4% |
| Other values (28) | 80039 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 244391 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 46303 | |
| r | 21033 | 8.6% |
| u | 15396 | 6.3% |
| n | 15294 | 6.3% |
| h | 13678 | 5.6% |
| i | 13664 | 5.6% |
| e | 11353 | 4.6% |
| m | 10991 | 4.5% |
| d | 8334 | 3.4% |
| t | 8306 | 3.4% |
| Other values (28) | 80039 |
Date
| Distinct | 2009 |
|---|---|
| Distinct (%) | 6.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 230.8 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Characters and Unicode
| Total characters | 295310 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2015-01-01 |
|---|---|
| 2nd row | 2015-01-02 |
| 3rd row | 2015-01-03 |
| 4th row | 2015-01-04 |
| 5th row | 2015-01-05 |
Common Values
| Value | Count | Frequency (%) |
| 2020-05-13 | 26 | 0.1% |
| 2020-06-18 | 26 | 0.1% |
| 2020-06-21 | 26 | 0.1% |
| 2020-05-22 | 26 | 0.1% |
| 2020-04-14 | 26 | 0.1% |
| 2020-05-27 | 26 | 0.1% |
| 2020-06-13 | 26 | 0.1% |
| 2020-05-16 | 26 | 0.1% |
| 2020-05-25 | 26 | 0.1% |
| 2020-03-28 | 26 | 0.1% |
| Other values (1999) | 29271 |
[Figure: Length histogram]
Words
| Value | Count | Frequency (%) |
| 2020-05-13 | 26 | 0.1% |
| 2020-06-26 | 26 | 0.1% |
| 2020-03-11 | 26 | 0.1% |
| 2020-06-11 | 26 | 0.1% |
| 2020-03-19 | 26 | 0.1% |
| 2020-04-09 | 26 | 0.1% |
| 2020-04-06 | 26 | 0.1% |
| 2020-04-25 | 26 | 0.1% |
| 2020-05-09 | 26 | 0.1% |
| 2020-04-01 | 26 | 0.1% |
| Other values (1999) | 29271 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 70666 | |
| - | 59062 | |
| 2 | 51607 | |
| 1 | 49700 | |
| 9 | 12479 | 4.2% |
| 8 | 11560 | 3.9% |
| 7 | 9799 | 3.3% |
| 6 | 9198 | 3.1% |
| 5 | 8529 | 2.9% |
| 3 | 7101 | 2.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 236248 | |
| Dash Punctuation | 59062 | 20.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 70666 | |
| 2 | 51607 | |
| 1 | 49700 | |
| 9 | 12479 | 5.3% |
| 8 | 11560 | 4.9% |
| 7 | 9799 | 4.1% |
| 6 | 9198 | 3.9% |
| 5 | 8529 | 3.6% |
| 3 | 7101 | 3.0% |
| 4 | 5609 | 2.4% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 59062 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 295310 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 70666 | |
| - | 59062 | |
| 2 | 51607 | |
| 1 | 49700 | |
| 9 | 12479 | 4.2% |
| 8 | 11560 | 3.9% |
| 7 | 9799 | 3.3% |
| 6 | 9198 | 3.1% |
| 5 | 8529 | 2.9% |
| 3 | 7101 | 2.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 295310 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 70666 | |
| - | 59062 | |
| 2 | 51607 | |
| 1 | 49700 | |
| 9 | 12479 | 4.2% |
| 8 | 11560 | 3.9% |
| 7 | 9799 | 3.3% |
| 6 | 9198 | 3.1% |
| 5 | 8529 | 2.9% |
| 3 | 7101 | 2.4% |
PM2.5
| Distinct | 11716 |
|---|---|
| Distinct (%) | 47.0% |
| Missing | 4598 |
| Missing (%) | 15.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 67.45057795 |
| Minimum | 0.04 |
| Maximum | 949.99 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 230.8 KiB |
Quantile statistics
| Minimum | 0.04 |
|---|---|
| 5-th percentile | 13.206 |
| Q1 | 28.82 |
| median | 48.57 |
| Q3 | 80.59 |
| 95-th percentile | 193.96 |
| Maximum | 949.99 |
| Range | 949.95 |
| Interquartile range (IQR) | 51.77 |
Descriptive statistics
| Standard deviation | 64.66144946 |
|---|---|
| Coefficient of variation (CV) | 0.9586493018 |
| Kurtosis | 21.13222159 |
| Mean | 67.45057795 |
| Median Absolute Deviation (MAD) | 23.43 |
| Skewness | 3.369959851 |
| Sum | 1681745.26 |
| Variance | 4181.103046 |
| Monotonicity | Not monotonic |
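Several of the statistics above are derived from one another. The sketch below re-derives them from the reported values, as a consistency check rather than a description of how the profiler computes them. It assumes that Distinct (%) is taken over non-missing rows, which matches the reported 47.0%; the variable names are illustrative.

```python
# Values copied from the tables above.
n_rows, n_missing, n_distinct = 29531, 4598, 11716
mean, std = 67.45057795, 64.66144946
q1, q3 = 28.82, 80.59
minimum, maximum = 0.04, 949.99

n_valid = n_rows - n_missing                 # rows with a recorded value
distinct_pct = 100 * n_distinct / n_valid    # assumption: denominator excludes missing rows
cv = std / mean                              # coefficient of variation
variance = std ** 2
iqr = q3 - q1                                # interquartile range
value_range = maximum - minimum
total = mean * n_valid                       # sum of all recorded values

print(f"Distinct (%): {distinct_pct:.1f}")
print(f"CV: {cv:.6f}  Variance: {variance:.3f}")
print(f"IQR: {iqr:.2f}  Range: {value_range:.2f}  Sum: {total:.2f}")
```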
| Value | Count | Frequency (%) |
| 11 | 19 | 0.1% |
| 20.75 | 12 | < 0.1% |
| 27.82 | 11 | < 0.1% |
| 15 | 10 | < 0.1% |
| 11.81 | 10 | < 0.1% |
| 28.45 | 10 | < 0.1% |
| 29.75 | 10 | < 0.1% |
| 47.43 | 10 | < 0.1% |
| 18.81 | 10 | < 0.1% |
| 18.36 | 9 | < 0.1% |
| Other values (11706) | 24822 | |
| (Missing) | 4598 | 15.6% |
| Value | Count | Frequency (%) |
| 0.04 | 1 | |
| 0.16 | 1 | |
| 0.24 | 1 | |
| 0.28 | 1 | |
| 0.98 | 1 | |
| 0.99 | 1 | |
| 1.14 | 1 | |
| 1.19 | 1 | |
| 1.25 | 1 | |
| 1.39 | 1 |
| Value | Count | Frequency (%) |
| 949.99 | 1 | |
| 917.77 | 1 | |
| 916.67 | 1 | |
| 914.94 | 1 | |
| 914.64 | 1 | |
| 894.75 | 1 | |
| 868.66 | 1 | |
| 858.73 | 1 | |
| 832.8 | 1 | |
| 821.42 | 1 |
PM10
| Distinct | 12571 |
|---|---|
| Distinct (%) | 68.4% |
| Missing | 11140 |
| Missing (%) | 37.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 118.1271029 |
| Minimum | 0.01 |
| Maximum | 1000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 230.8 KiB |
Quantile statistics
| Minimum | 0.01 |
|---|---|
| 5-th percentile | 26.365 |
| Q1 | 56.255 |
| median | 95.68 |
| Q3 | 149.745 |
| 95-th percentile | 303.34 |
| Maximum | 1000 |
| Range | 999.99 |
| Interquartile range (IQR) | 93.49 |
Descriptive statistics
| Standard deviation | 90.60510972 |
|---|---|
| Coefficient of variation (CV) | 0.767013729 |
| Kurtosis | 6.747873494 |
| Mean | 118.1271029 |
| Median Absolute Deviation (MAD) | 43.92 |
| Skewness | 2.0531891 |
| Sum | 2172475.55 |
| Variance | 8209.285907 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 94 | 9 | < 0.1% |
| 33.81 | 7 | < 0.1% |
| 87.02 | 6 | < 0.1% |
| 39.46 | 6 | < 0.1% |
| 102.17 | 6 | < 0.1% |
| 109.67 | 6 | < 0.1% |
| 72.04 | 6 | < 0.1% |
| 20.53 | 6 | < 0.1% |
| 43.1 | 6 | < 0.1% |
| 84.08 | 6 | < 0.1% |
| Other values (12561) | 18327 | |
| (Missing) | 11140 |
| Value | Count | Frequency (%) |
| 0.01 | 1 | |
| 0.02 | 1 | |
| 0.03 | 1 | |
| 0.04 | 2 | |
| 0.06 | 1 | |
| 0.07 | 1 | |
| 0.13 | 2 | |
| 0.14 | 2 | |
| 0.16 | 1 | |
| 0.17 | 2 |
| Value | Count | Frequency (%) |
| 1000 | 1 | |
| 985 | 2 | |
| 917.08 | 1 | |
| 847.41 | 1 | |
| 802.87 | 1 | |
| 796.88 | 1 | |
| 768.16 | 1 | |
| 763.58 | 1 | |
| 761.91 | 1 | |
| 743.98 | 1 |
NO
| Distinct | 5776 |
|---|---|
| Distinct (%) | 22.3% |
| Missing | 3582 |
| Missing (%) | 12.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 17.57472966 |
| Minimum | 0.02 |
| Maximum | 390.68 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 230.8 KiB |
Quantile statistics
| Minimum | 0.02 |
|---|---|
| 5-th percentile | 1.7 |
| Q1 | 5.63 |
| median | 9.89 |
| Q3 | 19.95 |
| 95-th percentile | 61.19 |
| Maximum | 390.68 |
| Range | 390.66 |
| Interquartile range (IQR) | 14.32 |
Descriptive statistics
| Standard deviation | 22.78584633 |
|---|---|
| Coefficient of variation (CV) | 1.296511911 |
| Kurtosis | 25.16434683 |
| Mean | 17.57472966 |
| Median Absolute Deviation (MAD) | 5.64 |
| Skewness | 3.883166275 |
| Sum | 456046.66 |
| Variance | 519.1947932 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 5.93 | 34 | 0.1% |
| 8.78 | 29 | 0.1% |
| 7.78 | 29 | 0.1% |
| 0.92 | 28 | 0.1% |
| 1.94 | 27 | 0.1% |
| 0.97 | 27 | 0.1% |
| 0.9 | 26 | 0.1% |
| 2.89 | 26 | 0.1% |
| 7.97 | 26 | 0.1% |
| 5.23 | 25 | 0.1% |
| Other values (5766) | 25672 | |
| (Missing) | 3582 | 12.1% |
| Value | Count | Frequency (%) |
| 0.02 | 7 | |
| 0.03 | 3 | |
| 0.06 | 2 | < 0.1% |
| 0.09 | 2 | < 0.1% |
| 0.1 | 1 | < 0.1% |
| 0.11 | 2 | < 0.1% |
| 0.12 | 1 | < 0.1% |
| 0.13 | 1 | < 0.1% |
| 0.14 | 1 | < 0.1% |
| 0.18 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 390.68 | 1 | |
| 382.44 | 1 | |
| 351.3 | 1 | |
| 304.26 | 1 | |
| 289.75 | 1 | |
| 288.55 | 1 | |
| 287.14 | 1 | |
| 273.39 | 1 | |
| 270.09 | 1 | |
| 268.03 | 1 |
NO2
| Distinct | 7404 |
|---|---|
| Distinct (%) | 28.5% |
| Missing | 3585 |
| Missing (%) | 12.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 28.56065906 |
| Minimum | 0.01 |
| Maximum | 362.21 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 230.8 KiB |
Quantile statistics
| Minimum | 0.01 |
|---|---|
| 5-th percentile | 4.93 |
| Q1 | 11.75 |
| median | 21.69 |
| Q3 | 37.62 |
| 95-th percentile | 74.125 |
| Maximum | 362.21 |
| Range | 362.2 |
| Interquartile range (IQR) | 25.87 |
Descriptive statistics
| Standard deviation | 24.4747458 |
|---|---|
| Coefficient of variation (CV) | 0.8569391114 |
| Kurtosis | 11.211125 |
| Mean | 28.56065906 |
| Median Absolute Deviation (MAD) | 11.42 |
| Skewness | 2.46455959 |
| Sum | 741034.86 |
| Variance | 599.0131818 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 10.58 | 24 | 0.1% |
| 9.42 | 23 | 0.1% |
| 9.14 | 18 | 0.1% |
| 9.47 | 17 | 0.1% |
| 10.09 | 17 | 0.1% |
| 9.24 | 17 | 0.1% |
| 10.21 | 17 | 0.1% |
| 7.14 | 17 | 0.1% |
| 9.44 | 17 | 0.1% |
| 13.9 | 16 | 0.1% |
| Other values (7394) | 25763 | |
| (Missing) | 3585 | 12.1% |
| Value | Count | Frequency (%) |
| 0.01 | 2 | < 0.1% |
| 0.02 | 5 | |
| 0.03 | 9 | |
| 0.04 | 2 | < 0.1% |
| 0.05 | 3 | < 0.1% |
| 0.06 | 3 | < 0.1% |
| 0.07 | 7 | |
| 0.08 | 5 | |
| 0.09 | 7 | |
| 0.1 | 4 |
| Value | Count | Frequency (%) |
| 362.21 | 1 | |
| 292.02 | 1 | |
| 277.31 | 1 | |
| 273.39 | 1 | |
| 266.46 | 1 | |
| 245.62 | 1 | |
| 241.34 | 1 | |
| 239.18 | 1 | |
| 239.1 | 1 | |
| 237.27 | 1 |
NOx
Real number (ℝ≥0)
HIGH CORRELATION, HIGH CORRELATION, HIGH CORRELATION, HIGH CORRELATION, MISSING, ZEROS
| Distinct | 8156 |
|---|---|
| Distinct (%) | 32.2% |
| Missing | 4185 |
| Missing (%) | 14.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 32.30912333 |
| Minimum | 0 |
| Maximum | 467.63 |
| Zeros | 740 |
| Zeros (%) | 2.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 230.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2.4 |
| Q1 | 12.82 |
| median | 23.52 |
| Q3 | 40.1275 |
| 95-th percentile | 96.3575 |
| Maximum | 467.63 |
| Range | 467.63 |
| Interquartile range (IQR) | 27.3075 |
Descriptive statistics
| Standard deviation | 31.64601094 |
|---|---|
| Coefficient of variation (CV) | 0.9794760016 |
| Kurtosis | 10.83633513 |
| Mean | 32.30912333 |
| Median Absolute Deviation (MAD) | 12.69 |
| Skewness | 2.569914617 |
| Sum | 818907.04 |
| Variance | 1001.470008 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 740 | 2.5% |
| 4.22 | 208 | 0.7% |
| 6.24 | 115 | 0.4% |
| 4.3 | 35 | 0.1% |
| 2.21 | 31 | 0.1% |
| 4.95 | 19 | 0.1% |
| 4.14 | 18 | 0.1% |
| 4.47 | 17 | 0.1% |
| 4.97 | 16 | 0.1% |
| 4.05 | 14 | < 0.1% |
| Other values (8146) | 24133 | |
| (Missing) | 4185 | 14.2% |
| Value | Count | Frequency (%) |
| 0 | 740 | |
| 0.03 | 4 | < 0.1% |
| 0.04 | 9 | < 0.1% |
| 0.05 | 3 | < 0.1% |
| 0.06 | 2 | < 0.1% |
| 0.07 | 2 | < 0.1% |
| 0.09 | 1 | < 0.1% |
| 0.1 | 3 | < 0.1% |
| 0.11 | 2 | < 0.1% |
| 0.12 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 467.63 | 1 | |
| 382.84 | 1 | |
| 378.31 | 1 | |
| 378.24 | 1 | |
| 302.78 | 1 | |
| 293.1 | 1 | |
| 289.09 | 1 | |
| 287.89 | 1 | |
| 273.33 | 1 | |
| 271.94 | 1 |
NH3
| Distinct | 5922 |
|---|---|
| Distinct (%) | 30.8% |
| Missing | 10328 |
| Missing (%) | 35.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 23.48347602 |
| Minimum | 0.01 |
| Maximum | 352.89 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 230.8 KiB |
Quantile statistics
| Minimum | 0.01 |
|---|---|
| 5-th percentile | 2.74 |
| Q1 | 8.58 |
| median | 15.85 |
| Q3 | 30.02 |
| 95-th percentile | 63.427 |
| Maximum | 352.89 |
| Range | 352.88 |
| Interquartile range (IQR) | 21.44 |
Descriptive statistics
| Standard deviation | 25.684275 |
|---|---|
| Coefficient of variation (CV) | 1.093716917 |
| Kurtosis | 27.9646081 |
| Mean | 23.48347602 |
| Median Absolute Deviation (MAD) | 9.25 |
| Skewness | 4.083993436 |
| Sum | 450953.19 |
| Variance | 659.6819821 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 6.29 | 36 | 0.1% |
| 6.32 | 29 | 0.1% |
| 6.3 | 28 | 0.1% |
| 6.31 | 28 | 0.1% |
| 6.28 | 27 | 0.1% |
| 6.27 | 24 | 0.1% |
| 10.46 | 23 | 0.1% |
| 6.59 | 22 | 0.1% |
| 6.33 | 21 | 0.1% |
| 6.6 | 21 | 0.1% |
| Other values (5912) | 18944 | |
| (Missing) | 10328 |
| Value | Count | Frequency (%) |
| 0.01 | 2 | < 0.1% |
| 0.02 | 6 | |
| 0.04 | 1 | < 0.1% |
| 0.05 | 1 | < 0.1% |
| 0.06 | 1 | < 0.1% |
| 0.08 | 2 | < 0.1% |
| 0.1 | 1 | < 0.1% |
| 0.11 | 4 | |
| 0.12 | 3 | |
| 0.13 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 352.89 | 1 | |
| 328.89 | 1 | |
| 323.48 | 1 | |
| 309.04 | 1 | |
| 303.53 | 1 | |
| 302.08 | 1 | |
| 301.28 | 1 | |
| 301.18 | 1 | |
| 297.64 | 1 | |
| 296.43 | 1 |
CO
| Distinct | 1779 |
|---|---|
| Distinct (%) | 6.5% |
| Missing | 2059 |
| Missing (%) | 7.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.248598209 |
| Minimum | 0 |
| Maximum | 175.81 |
| Zeros | 2328 |
| Zeros (%) | 7.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 230.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.51 |
| median | 0.89 |
| Q3 | 1.45 |
| 95-th percentile | 8.0245 |
| Maximum | 175.81 |
| Range | 175.81 |
| Interquartile range (IQR) | 0.94 |
Descriptive statistics
| Standard deviation | 6.962884254 |
|---|---|
| Coefficient of variation (CV) | 3.096544428 |
| Kurtosis | 109.4880503 |
| Mean | 2.248598209 |
| Median Absolute Deviation (MAD) | 0.44 |
| Skewness | 8.878321522 |
| Sum | 61773.49 |
| Variance | 48.48175714 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 2328 | 7.9% |
| 0.68 | 209 | 0.7% |
| 0.85 | 208 | 0.7% |
| 0.8 | 205 | 0.7% |
| 0.89 | 203 | 0.7% |
| 0.84 | 200 | 0.7% |
| 0.78 | 200 | 0.7% |
| 0.81 | 199 | 0.7% |
| 0.64 | 198 | 0.7% |
| 0.67 | 194 | 0.7% |
| Other values (1769) | 23328 | |
| (Missing) | 2059 | 7.0% |
| Value | Count | Frequency (%) |
| 0 | 2328 | |
| 0.01 | 59 | 0.2% |
| 0.02 | 59 | 0.2% |
| 0.03 | 56 | 0.2% |
| 0.04 | 30 | 0.1% |
| 0.05 | 48 | 0.2% |
| 0.06 | 42 | 0.1% |
| 0.07 | 40 | 0.1% |
| 0.08 | 34 | 0.1% |
| 0.09 | 38 | 0.1% |
| Value | Count | Frequency (%) |
| 175.81 | 1 | |
| 145.32 | 1 | |
| 134.85 | 1 | |
| 132.47 | 1 | |
| 132.07 | 1 | |
| 124.01 | 1 | |
| 119.68 | 1 | |
| 119.3 | 1 | |
| 118.02 | 1 | |
| 118 | 1 |
SO2
| Distinct | 4761 |
|---|---|
| Distinct (%) | 18.5% |
| Missing | 3854 |
| Missing (%) | 13.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 14.53197726 |
| Minimum | 0.01 |
| Maximum | 193.86 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 230.8 KiB |
Quantile statistics
| Minimum | 0.01 |
|---|---|
| 5-th percentile | 2.63 |
| Q1 | 5.67 |
| median | 9.16 |
| Q3 | 15.22 |
| 95-th percentile | 46.208 |
| Maximum | 193.86 |
| Range | 193.85 |
| Interquartile range (IQR) | 9.55 |
Descriptive statistics
| Standard deviation | 18.13377485 |
|---|---|
| Coefficient of variation (CV) | 1.247853236 |
| Kurtosis | 22.0671006 |
| Mean | 14.53197726 |
| Median Absolute Deviation (MAD) | 4.12 |
| Skewness | 4.083659555 |
| Sum | 373137.58 |
| Variance | 328.8337902 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 5.74 | 36 | 0.1% |
| 6.12 | 35 | 0.1% |
| 6.61 | 32 | 0.1% |
| 5.81 | 32 | 0.1% |
| 5.53 | 32 | 0.1% |
| 4.65 | 32 | 0.1% |
| 6.47 | 31 | 0.1% |
| 5.57 | 31 | 0.1% |
| 5.95 | 31 | 0.1% |
| 5.13 | 30 | 0.1% |
| Other values (4751) | 25355 | |
| (Missing) | 3854 | 13.1% |
| Value | Count | Frequency (%) |
| 0.01 | 1 | |
| 0.04 | 1 | |
| 0.21 | 1 | |
| 0.26 | 1 | |
| 0.36 | 1 | |
| 0.41 | 2 | |
| 0.42 | 1 | |
| 0.44 | 1 | |
| 0.48 | 1 | |
| 0.49 | 1 |
| Value | Count | Frequency (%) |
| 193.86 | 1 | |
| 187.02 | 1 | |
| 186.08 | 1 | |
| 182.39 | 1 | |
| 180.85 | 1 | |
| 179.18 | 1 | |
| 178.93 | 1 | |
| 178.63 | 1 | |
| 178.58 | 1 | |
| 176.88 | 1 |
O3
| Distinct | 7699 |
|---|---|
| Distinct (%) | 30.2% |
| Missing | 4022 |
| Missing (%) | 13.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 34.49143048 |
| Minimum | 0.01 |
| Maximum | 257.73 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 230.8 KiB |
Quantile statistics
| Minimum | 0.01 |
|---|---|
| 5-th percentile | 7.02 |
| Q1 | 18.86 |
| median | 30.84 |
| Q3 | 45.57 |
| 95-th percentile | 74.142 |
| Maximum | 257.73 |
| Range | 257.72 |
| Interquartile range (IQR) | 26.71 |
Descriptive statistics
| Standard deviation | 21.69492819 |
|---|---|
| Coefficient of variation (CV) | 0.6289947356 |
| Kurtosis | 3.429464538 |
| Mean | 34.49143048 |
| Median Absolute Deviation (MAD) | 12.96 |
| Skewness | 1.330119322 |
| Sum | 879841.9 |
| Variance | 470.6699093 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 16.48 | 17 | 0.1% |
| 22.14 | 15 | 0.1% |
| 23.6 | 15 | 0.1% |
| 19.64 | 14 | < 0.1% |
| 18.33 | 14 | < 0.1% |
| 22.94 | 13 | < 0.1% |
| 13.14 | 13 | < 0.1% |
| 32.06 | 13 | < 0.1% |
| 19.68 | 13 | < 0.1% |
| 25.3 | 12 | < 0.1% |
| Other values (7689) | 25370 | |
| (Missing) | 4022 | 13.6% |
| Value | Count | Frequency (%) |
| 0.01 | 4 | |
| 0.02 | 7 | |
| 0.03 | 2 | < 0.1% |
| 0.04 | 3 | < 0.1% |
| 0.05 | 2 | < 0.1% |
| 0.06 | 3 | < 0.1% |
| 0.07 | 1 | < 0.1% |
| 0.1 | 8 | |
| 0.11 | 2 | < 0.1% |
| 0.12 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 257.73 | 1 | |
| 200.41 | 1 | |
| 193.31 | 1 | |
| 186.07 | 1 | |
| 177.07 | 1 | |
| 175.04 | 1 | |
| 172.28 | 1 | |
| 169.36 | 1 | |
| 169.35 | 1 | |
| 165.48 | 1 |
Benzene
Real number (ℝ≥0)
HIGH CORRELATION, HIGH CORRELATION, HIGH CORRELATION, HIGH CORRELATION, MISSING, SKEWED, ZEROS
| Distinct | 1873 |
|---|---|
| Distinct (%) | 7.8% |
| Missing | 5623 |
| Missing (%) | 19.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.280840305 |
| Minimum | 0 |
| Maximum | 455.03 |
| Zeros | 3802 |
| Zeros (%) | 12.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 230.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.12 |
| median | 1.07 |
| Q3 | 3.08 |
| 95-th percentile | 9.72 |
| Maximum | 455.03 |
| Range | 455.03 |
| Interquartile range (IQR) | 2.96 |
Descriptive statistics
| Standard deviation | 15.81113642 |
|---|---|
| Coefficient of variation (CV) | 4.81923378 |
| Kurtosis | 530.1714706 |
| Mean | 3.280840305 |
| Median Absolute Deviation (MAD) | 1.06 |
| Skewness | 21.30421849 |
| Sum | 78438.33 |
| Variance | 249.9920349 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 3802 | 12.9% |
| 0.03 | 300 | 1.0% |
| 0.02 | 292 | 1.0% |
| 0.01 | 217 | 0.7% |
| 0.04 | 190 | 0.6% |
| 0.05 | 176 | 0.6% |
| 0.09 | 170 | 0.6% |
| 2 | 170 | 0.6% |
| 0.1 | 167 | 0.6% |
| 0.08 | 157 | 0.5% |
| Other values (1863) | 18267 | |
| (Missing) | 5623 | 19.0% |
| Value | Count | Frequency (%) |
| 0 | 3802 | |
| 0.01 | 217 | 0.7% |
| 0.02 | 292 | 1.0% |
| 0.03 | 300 | 1.0% |
| 0.04 | 190 | 0.6% |
| 0.05 | 176 | 0.6% |
| 0.06 | 146 | 0.5% |
| 0.07 | 123 | 0.4% |
| 0.08 | 157 | 0.5% |
| 0.09 | 170 | 0.6% |
| Value | Count | Frequency (%) |
| 455.03 | 1 | |
| 454.85 | 1 | |
| 449.38 | 1 | |
| 448.59 | 1 | |
| 445.83 | 1 | |
| 443.63 | 1 | |
| 438.01 | 1 | |
| 435.9 | 1 | |
| 435.09 | 1 | |
| 432.94 | 1 |
Toluene
Real number (ℝ≥0)
HIGH CORRELATION, HIGH CORRELATION, HIGH CORRELATION, HIGH CORRELATION, MISSING, ZEROS
| Distinct | 3608 |
|---|---|
| Distinct (%) | 16.8% |
| Missing | 8041 |
| Missing (%) | 27.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8.70097208 |
| Minimum | 0 |
| Maximum | 454.85 |
| Zeros | 2861 |
| Zeros (%) | 9.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 230.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.6 |
| median | 2.97 |
| Q3 | 9.15 |
| 95-th percentile | 33.92 |
| Maximum | 454.85 |
| Range | 454.85 |
| Interquartile range (IQR) | 8.55 |
Descriptive statistics
| Standard deviation | 19.96916366 |
|---|---|
| Coefficient of variation (CV) | 2.295049734 |
| Kurtosis | 216.7455066 |
| Mean | 8.70097208 |
| Median Absolute Deviation (MAD) | 2.94 |
| Skewness | 11.66612883 |
| Sum | 186983.89 |
| Variance | 398.7674972 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 2861 | 9.7% |
| 0.02 | 111 | 0.4% |
| 0.03 | 102 | 0.3% |
| 0.05 | 99 | 0.3% |
| 0.04 | 86 | 0.3% |
| 1.1 | 83 | 0.3% |
| 6 | 79 | 0.3% |
| 0.08 | 76 | 0.3% |
| 0.06 | 72 | 0.2% |
| 0.01 | 70 | 0.2% |
| Other values (3598) | 17851 | |
| (Missing) | 8041 |
| Value | Count | Frequency (%) |
| 0 | 2861 | |
| 0.01 | 70 | 0.2% |
| 0.02 | 111 | 0.4% |
| 0.03 | 102 | 0.3% |
| 0.04 | 86 | 0.3% |
| 0.05 | 99 | 0.3% |
| 0.06 | 72 | 0.2% |
| 0.07 | 61 | 0.2% |
| 0.08 | 76 | 0.3% |
| 0.09 | 54 | 0.2% |
| Value | Count | Frequency (%) |
| 454.85 | 1 | |
| 454.12 | 1 | |
| 449.14 | 1 | |
| 448.87 | 1 | |
| 445.84 | 1 | |
| 443.63 | 1 | |
| 437.77 | 1 | |
| 435.94 | 1 | |
| 434.92 | 1 | |
| 433.02 | 1 |
Xylene
| Distinct | 1561 |
|---|---|
| Distinct (%) | 13.7% |
| Missing | 18109 |
| Missing (%) | 61.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.070127823 |
| Minimum | 0 |
| Maximum | 170.37 |
| Zeros | 1747 |
| Zeros (%) | 5.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 230.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.14 |
| median | 0.98 |
| Q3 | 3.35 |
| 95-th percentile | 12.558 |
| Maximum | 170.37 |
| Range | 170.37 |
| Interquartile range (IQR) | 3.21 |
Descriptive statistics
| Standard deviation | 6.323247407 |
|---|---|
| Coefficient of variation (CV) | 2.059603955 |
| Kurtosis | 119.9801163 |
| Mean | 3.070127823 |
| Median Absolute Deviation (MAD) | 0.98 |
| Skewness | 7.891515254 |
| Sum | 35067 |
| Variance | 39.98345777 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 1747 | 5.9% |
| 0.1 | 255 | 0.9% |
| 2 | 142 | 0.5% |
| 0.65 | 120 | 0.4% |
| 0.12 | 108 | 0.4% |
| 0.11 | 93 | 0.3% |
| 0.15 | 80 | 0.3% |
| 0.13 | 80 | 0.3% |
| 0.16 | 77 | 0.3% |
| 0.52 | 76 | 0.3% |
| Other values (1551) | 8644 | |
| (Missing) | 18109 |
| Value | Count | Frequency (%) |
| 0 | 1747 | |
| 0.01 | 68 | 0.2% |
| 0.02 | 50 | 0.2% |
| 0.03 | 52 | 0.2% |
| 0.04 | 42 | 0.1% |
| 0.05 | 52 | 0.2% |
| 0.06 | 56 | 0.2% |
| 0.07 | 72 | 0.2% |
| 0.08 | 62 | 0.2% |
| 0.09 | 62 | 0.2% |
| Value | Count | Frequency (%) |
| 170.37 | 1 | |
| 137.45 | 1 | |
| 125.18 | 1 | |
| 116.62 | 1 | |
| 109.23 | 1 | |
| 105.76 | 1 | |
| 94.48 | 1 | |
| 89.7 | 1 | |
| 84.72 | 1 | |
| 81.26 | 1 |
AQI
| Distinct | 829 |
|---|---|
| Distinct (%) | 3.3% |
| Missing | 4681 |
| Missing (%) | 15.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 166.4635815 |
| Minimum | 13 |
| Maximum | 2049 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 230.8 KiB |
Quantile statistics
| Minimum | 13 |
|---|---|
| 5-th percentile | 50 |
| Q1 | 81 |
| median | 118 |
| Q3 | 208 |
| 95-th percentile | 407 |
| Maximum | 2049 |
| Range | 2036 |
| Interquartile range (IQR) | 127 |
Descriptive statistics
| Standard deviation | 140.6965851 |
|---|---|
| Coefficient of variation (CV) | 0.8452094076 |
| Kurtosis | 21.42372711 |
| Mean | 166.4635815 |
| Median Absolute Deviation (MAD) | 48 |
| Skewness | 3.396757198 |
| Sum | 4136620 |
| Variance | 19795.52906 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 102 | 223 | 0.8% |
| 100 | 222 | 0.8% |
| 70 | 208 | 0.7% |
| 106 | 208 | 0.7% |
| 78 | 198 | 0.7% |
| 98 | 195 | 0.7% |
| 66 | 192 | 0.7% |
| 104 | 192 | 0.7% |
| 80 | 190 | 0.6% |
| 92 | 187 | 0.6% |
| Other values (819) | 22835 | |
| (Missing) | 4681 | 15.9% |
| Value | Count | Frequency (%) |
| 13 | 1 | < 0.1% |
| 14 | 3 | < 0.1% |
| 15 | 3 | < 0.1% |
| 16 | 4 | < 0.1% |
| 17 | 7 | < 0.1% |
| 18 | 2 | < 0.1% |
| 19 | 27 | |
| 20 | 29 | |
| 21 | 7 | < 0.1% |
| 22 | 8 | < 0.1% |
| Value | Count | Frequency (%) |
| 2049 | 1 | |
| 1917 | 1 | |
| 1842 | 1 | |
| 1747 | 1 | |
| 1719 | 1 | |
| 1672 | 1 | |
| 1646 | 1 | |
| 1630 | 1 | |
| 1613 | 1 | |
| 1595 | 1 |
AQI_Bucket
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 4681 |
| Missing (%) | 15.9% |
| Memory size | 230.8 KiB |
Length
| Max length | 12 |
|---|---|
| Median length | 8 |
| Mean length | 8.646639839 |
| Min length | 4 |
Characters and Unicode
| Total characters | 214869 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Poor |
|---|---|
| 2nd row | Very Poor |
| 3rd row | Severe |
| 4th row | Severe |
| 5th row | Severe |
Common Values
| Value | Count | Frequency (%) |
| Moderate | 8829 | 29.9% |
| Satisfactory | 8224 | 27.8% |
| Poor | 2781 | 9.4% |
| Very Poor | 2337 | 7.9% |
| Good | 1341 | 4.5% |
| Severe | 1338 | 4.5% |
| (Missing) | 4681 | 15.9% |
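The length and character figures reported for this variable can be re-derived from the category counts in the Common Values table. The following is a minimal sketch under that assumption; the `counts` dictionary is typed in by hand from the table and is not read from the dataset.

```python
from statistics import median

# Category counts copied from the Common Values table (missing values excluded).
counts = {
    "Moderate": 8829,
    "Satisfactory": 8224,
    "Poor": 2781,
    "Very Poor": 2337,
    "Good": 1341,
    "Severe": 1338,
}

n_valid = sum(counts.values())
total_characters = sum(len(label) * n for label, n in counts.items())
mean_length = total_characters / n_valid

# Expand to one length per observation to take the median.
lengths = [len(label) for label, n in counts.items() for _ in range(n)]
median_length = median(lengths)

distinct_characters = set("".join(counts))

print(n_valid, total_characters, round(mean_length, 6))
print(min(lengths), median_length, max(lengths), len(distinct_characters))
```

The results can be compared with the Length and Characters and Unicode tables for the same variable.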
Length
Pie chart
| Value | Count | Frequency (%) |
| moderate | 8829 | |
| satisfactory | 8224 | |
| poor | 5118 | |
| very | 2337 | 8.6% |
| good | 1341 | 4.9% |
| severe | 1338 | 4.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 29971 | |
| r | 25846 | |
| a | 25277 | |
| t | 25277 | |
| e | 24009 | |
| y | 10561 | 4.9% |
| d | 10170 | 4.7% |
| S | 9562 | 4.5% |
| M | 8829 | 4.1% |
| c | 8224 | 3.8% |
| Other values (8) | 37143 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 185345 | |
| Uppercase Letter | 27187 | 12.7% |
| Space Separator | 2337 | 1.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 29971 | |
| r | 25846 | |
| a | 25277 | |
| t | 25277 | |
| e | 24009 | |
| y | 10561 | 5.7% |
| d | 10170 | 5.5% |
| c | 8224 | 4.4% |
| s | 8224 | 4.4% |
| f | 8224 | 4.4% |
| Other values (2) | 9562 | 5.2% |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 9562 | |
| M | 8829 | |
| P | 5118 | |
| V | 2337 | 8.6% |
| G | 1341 | 4.9% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 2337 | |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 212532 | |
| Common | 2337 | 1.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 29971 | |
| r | 25846 | |
| a | 25277 | |
| t | 25277 | |
| e | 24009 | |
| y | 10561 | 5.0% |
| d | 10170 | 4.8% |
| S | 9562 | 4.5% |
| M | 8829 | 4.2% |
| c | 8224 | 3.9% |
| Other values (7) | 34806 |
Common
| Value | Count | Frequency (%) |
| (space) | 2337 | |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 214869 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| o | 29971 | |
| r | 25846 | |
| a | 25277 | |
| t | 25277 | |
| e | 24009 | |
| y | 10561 | 4.9% |
| d | 10170 | 4.7% |
| S | 9562 | 4.5% |
| M | 8829 | 4.1% |
| c | 8224 | 3.8% |
| Other values (8) | 37143 |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
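As an illustration of the definition above, the following standard-library sketch computes ρ as Pearson's r applied to rank variables. The helper names (`average_ranks`, `pearson`, `spearman`) and the sample data are made up for this example, and giving tied values their average rank is an assumption about tie handling, not something the report specifies.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]  # monotonic but clearly nonlinear

rho = spearman(x, y)
r = pearson(x, y)
print(rho, r)
```

Because `y` increases whenever `x` does, the ranks agree exactly and ρ comes out at 1 (up to floating-point rounding), while r on the raw values stays below 1.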
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
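The invariance property described above can be checked numerically. This is a small sketch with made-up data and a hypothetical `pearson` helper; it uses positive scale factors only, since a negative factor would flip the sign of r.

```python
from statistics import mean


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [2.1, 3.9, 8.2, 13.8, 22.5]

r = pearson(x, y)

# Shift and rescale each variable separately: r should not change.
r_transformed = pearson([3 * a + 10 for a in x], [0.5 * b - 4 for b in y])

# Exact linear relations with very different slopes both give r = 1.
r_shallow = pearson(x, [0.1 * a for a in x])
r_steep = pearson(x, [50 * a for a in x])

print(r, r_transformed, r_shallow, r_steep)
```

The two slope cases show why the angle of a line to the x-axis carries no information about r: only how tightly the points follow a line matters.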
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
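The pair-counting definition above can be sketched directly. The function name `kendall_tau_a` and the sample data are illustrative; dividing by the total number of pairs corresponds to the variant usually called tau-a, and treating tied pairs as neither concordant nor discordant is an assumption of this sketch.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs, ties counted as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 5]

tau = kendall_tau_a(x, y)
print(tau)
```

For this sample, 7 of the 10 pairs are concordant and 3 are discordant, so τ = (7 - 3) / 10.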
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available.
First rows
| | City | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | AQI_Bucket |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Ahmedabad | 2015-01-01 | NaN | NaN | 0.92 | 18.22 | 17.15 | NaN | 0.92 | 27.64 | 133.36 | 0.00 | 0.02 | 0.00 | NaN | NaN |
| 1 | Ahmedabad | 2015-01-02 | NaN | NaN | 0.97 | 15.69 | 16.46 | NaN | 0.97 | 24.55 | 34.06 | 3.68 | 5.50 | 3.77 | NaN | NaN |
| 2 | Ahmedabad | 2015-01-03 | NaN | NaN | 17.40 | 19.30 | 29.70 | NaN | 17.40 | 29.07 | 30.70 | 6.80 | 16.40 | 2.25 | NaN | NaN |
| 3 | Ahmedabad | 2015-01-04 | NaN | NaN | 1.70 | 18.48 | 17.97 | NaN | 1.70 | 18.59 | 36.08 | 4.43 | 10.14 | 1.00 | NaN | NaN |
| 4 | Ahmedabad | 2015-01-05 | NaN | NaN | 22.10 | 21.42 | 37.76 | NaN | 22.10 | 39.33 | 39.31 | 7.01 | 18.89 | 2.78 | NaN | NaN |
| 5 | Ahmedabad | 2015-01-06 | NaN | NaN | 45.41 | 38.48 | 81.50 | NaN | 45.41 | 45.76 | 46.51 | 5.42 | 10.83 | 1.93 | NaN | NaN |
| 6 | Ahmedabad | 2015-01-07 | NaN | NaN | 112.16 | 40.62 | 130.77 | NaN | 112.16 | 32.28 | 33.47 | 0.00 | 0.00 | 0.00 | NaN | NaN |
| 7 | Ahmedabad | 2015-01-08 | NaN | NaN | 80.87 | 36.74 | 96.75 | NaN | 80.87 | 38.54 | 31.89 | 0.00 | 0.00 | 0.00 | NaN | NaN |
| 8 | Ahmedabad | 2015-01-09 | NaN | NaN | 29.16 | 31.00 | 48.00 | NaN | 29.16 | 58.68 | 25.75 | 0.00 | 0.00 | 0.00 | NaN | NaN |
| 9 | Ahmedabad | 2015-01-10 | NaN | NaN | NaN | 7.04 | 0.00 | NaN | NaN | 8.29 | 4.55 | 0.00 | 0.00 | 0.00 | NaN | NaN |
Last rows
| | City | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | AQI_Bucket |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 29521 | Visakhapatnam | 2020-06-22 | 33.17 | 108.22 | 5.58 | 42.45 | 27.06 | 13.70 | 0.73 | 13.65 | 34.85 | 3.99 | 10.24 | 2.32 | 95.0 | Satisfactory |
| 29522 | Visakhapatnam | 2020-06-23 | 25.40 | 83.38 | 2.76 | 34.09 | 19.92 | 13.13 | 0.54 | 10.40 | 43.27 | 2.88 | 12.03 | 1.33 | 100.0 | Satisfactory |
| 29523 | Visakhapatnam | 2020-06-24 | 34.36 | 90.90 | 1.22 | 23.38 | 13.12 | 14.45 | 0.56 | 10.92 | 35.12 | 2.99 | 3.15 | 1.60 | 86.0 | Satisfactory |
| 29524 | Visakhapatnam | 2020-06-25 | 13.45 | 58.54 | 2.30 | 21.60 | 13.09 | 12.27 | 0.41 | 8.19 | 29.38 | 1.28 | 5.64 | 0.92 | 77.0 | Satisfactory |
| 29525 | Visakhapatnam | 2020-06-26 | 7.63 | 32.27 | 5.91 | 23.27 | 17.19 | 11.15 | 0.46 | 6.87 | 19.90 | 1.45 | 5.37 | 1.45 | 47.0 | Good |
| 29526 | Visakhapatnam | 2020-06-27 | 15.02 | 50.94 | 7.68 | 25.06 | 19.54 | 12.47 | 0.47 | 8.55 | 23.30 | 2.24 | 12.07 | 0.73 | 41.0 | Good |
| 29527 | Visakhapatnam | 2020-06-28 | 24.38 | 74.09 | 3.42 | 26.06 | 16.53 | 11.99 | 0.52 | 12.72 | 30.14 | 0.74 | 2.21 | 0.38 | 70.0 | Satisfactory |
| 29528 | Visakhapatnam | 2020-06-29 | 22.91 | 65.73 | 3.45 | 29.53 | 18.33 | 10.71 | 0.48 | 8.42 | 30.96 | 0.01 | 0.01 | 0.00 | 68.0 | Satisfactory |
| 29529 | Visakhapatnam | 2020-06-30 | 16.64 | 49.97 | 4.05 | 29.26 | 18.80 | 10.03 | 0.52 | 9.84 | 28.30 | 0.00 | 0.00 | 0.00 | 54.0 | Satisfactory |
| 29530 | Visakhapatnam | 2020-07-01 | 15.00 | 66.00 | 0.40 | 26.85 | 14.05 | 5.20 | 0.59 | 2.10 | 17.05 | NaN | NaN | NaN | 50.0 | Good |